An international collaborative family-based whole genome quantitative trait linkage scan for myopic refractive error.

Purpose To investigate quantitative trait loci linked to refractive error, we performed a genome-wide quantitative trait linkage analysis using single nucleotide polymorphism markers and family data from five international sites. Methods Genomic DNA samples from 254 families were genotyped by the Center for Inherited Disease Research using the Illumina Linkage Panel IVb. Quantitative trait linkage analysis was performed on 225 Caucasian families and 4,656 markers after accounting for linkage disequilibrium and quality control exclusions. Two refractive quantitative phenotypes, sphere (SPH) and spherical equivalent (SE), were analyzed. The SOLAR program was used to estimate identity by descent probabilities and to conduct two-point and multipoint quantitative trait linkage analyses. Results We found 29 markers and 11 linkage regions reaching peak two-point and multipoint logarithms of the odds (LODs)>1.5. Four linkage regions revealed at least one LOD score greater than 2: chromosome 6q13–6q16.1 (LOD=1.96 for SPH, 2.18 for SE), chromosome 5q35.1–35.2 (LOD=2.05 for SPH, 1.80 for SE), chromosome 7q11.23–7q21.2 (LOD=1.19 for SPH, 2.03 for SE), and chromosome 3q29 (LOD=1.07 for SPH, 2.05 for SE). Among these, the chromosome 6 and chromosome 5 regions showed the most consistent results between SPH and SEM. Four linkage regions with multipoint scores above 1.5 are near or within the known myopia (MYP) loci of MYP3, MYP12, MYP14, and MYP16. Overall, we observed consistent linkage signals across the SPH and SEM phenotypes, although scores were generally higher for the SEM phenotype. Conclusions Our quantitative trait linkage analyses of a large myopia family cohort provided additional evidence for several known MYP loci, and identified two additional potential loci at chromosome 6q13–16.1 and chromosome 5q35.1–35.2 for myopia. These results will benefit the efforts toward determining genes for myopic refractive error.

were initially identified through quantitative trait linkage analysis using refractive error measurements (SPH and SE) as the traits. For example, the MYP7-MYP10 loci were identified using 221 dizygotic (DZ) twins from a classical twin study designed to investigate the heritability of refractive error. The strongest linkage signal was at chromosome 11p13, which contained the biologically relevant candidate gene paired box gene 6 (PAX6). In contrast to the twin study design, the MYP14 locus was a QTL found after 49 multigenerational Ashkenazi Jewish families ascertained for common myopia using SEM as the trait were analyzed [22]. No evidence of linkage to this region was found in the authors' efforts to replicate this finding in a metaanalysis consisting of Old Order Amish, African American, and Caucasian subjects [23]. However, in a later effort to finemap the MYP14 locus in a combined cohort of Old Order Amish and Ashkenazi Jewish families, replication of this locus was accomplished, and the QTL was narrowed to a 10 Mb area extending from chromosome 1p34.2 to chromosome 1p35.2 [24].
As was the case with localizing MYP14, successful efforts to find QTLs for refractive error have often used homogenous populations in genome-wide linkage studies. These homogenous populations have included Ashkenazi Jews [22][23][24][25], Caucasians [23,26], African Americans [23,26], and Old Order Amish [23,27], and different loci have been identified. For instance, MYP14 was not found in Caucasians, and 4q21 and 12q24 were found in Caucasians but not Ashkenazi Jews and Old Order Amish [21,22,28]. Overall, the QTL studies to date used either a small number of multigenerational pedigrees or twins. A more comprehensive and large-scale approach will help verify the existing regions.
This study is an international collaborative effort combining high-grade myopia pedigrees from five sites, leading to the largest family data set for myopia to date. We performed a genome-wide quantitative trait linkage scan using SPH and SEM directly and compared the results to known myopia loci. The outcomes of this study provide us with additional information regarding known genetic loci, and we have identified new myopia loci.

METHODS
Patients and families: As previously described [29], 254 multiplex families (at least two affected individuals per family) consisting of 1,411 subjects (47% male) were ascertained independently at five international sites. Collaboration in this retrospective analysis occurred after each program had already recruited subjects. For the quantitative analysis presented here, we analyzed the largest subset of this data, which comprised 225 Caucasian families and 1,168 subjects (47% male). Participating centers included Cardiff University in the UK (CARD), Duke University Medical Center in the United States (DUK), the Kennedy Institute of National Eye Clinic at Hellerup, Denmark (HEL), the University of Melbourne in Australia (MEL), and Toulouse University in France (TOU). Before recruiting subjects, all study sites obtained the appropriate institutional review board human subjects research study approvals. All principles of the Declaration of Helsinki were adhered to. Individuals were not included in the study if they had any ocular disease or insult that could predispose to myopia, such as retinopathy of prematurity, retinal dystrophy, corneal keratopathy, and any genetic syndromes that include myopia as a clinical phenotypic component. Licensed ophthalmologists or optometrists conducted complete eye examinations on all consenting subjects. At each study site, subjects filled out clinical and family history questionnaires. SPH and SEM quantitative phenotypes were obtained for each individual.
Sample preparation and genotyping: All subject genomic DNA samples were cataloged at the Duke University Center for Human Genetics (Duke CHG) and genotyped at the National Institutes of Health Center for Inherited Disease Research (CIDR). Most samples were extracted from blood (77.36%), while the remaining samples were derived from buccal mucosa (22.16%) or saliva specimens (0.48%). The genotyping platform used in this study was the Illumina Linkage Panel IVb consisting of 6,008 genome-wide single nucleotide polymorphisms (SNPs; Illumina). Following the CIDR genotyping protocol, each 96-well DNA sample plate Quality control and data cleaning: Following CIDR genotyping protocol, several quality control measures were implemented to determine the final set of markers released to the Duke CHG for analysis [29]. A total of 5,928 SNPs were released by the CIDR for data analysis. Of these markers, 5,903 had GeneCall scores (a measure of how close a genotype is to the center of the cluster of other samples assigned to the same genotypes) greater than 0.15, and the 5,903 markers were taken forward for analysis in the pedigrees.
To examine family relationships using RELPAIR [30,31] and PREST [32], 700 markers with approximately equal inter-marker distances across the genome were selected. All family relationship errors were subsequently corrected, and the PEDCHECK software program [33] was used to again check for Mendelian inconsistencies. When we found inconsistencies, we followed one of two options: 1) we assigned the missing genotypes within a family for the members directly involved in the Mendelian inconsistencies, or 2) we dropped an individual from further analysis if the family designations were ambiguous and therefore could not be reconciled.
All SNPs were tested for the Hardy-Weinberg equilibrium (HWE). Two data sets with unrelated samples were formed in which one affected individual sample per family was randomly selected to cluster within a designated affected group and one unaffected individual sample per family was selected to add to a designated unaffected group. The HWE was then assessed using an exact test implemented in the Genetic Data Analysis (GDA) program [34]. Using 3,200 permutations (the default setting for the GDA program), we estimated empirical p values for each marker. We used the Q-VALUE program to correct for multiple testing [35]. A marker with a q value less than 0.2 was declared a significant deviation from the HWE and was excluded in the linkage analysis. At this point, 5,744 markers remained. To account for linkage disequilibrium among SNPs, we used LDSelect to determine tagging SNPs for analysis with an r 2 cutoff of 0.16 [36]. Subsequently, we identified 4,656 tagging SNPs that comprised the final marker set that we defined for linkage analysis.
Whole genome quantitative trait linkage analysis: For this analysis, we investigated two phenotypes (SPH and SE) that were computed as the binocular average of measurements of the right and left eyes. These binocular averages for SPH and SEM were used directly for the quantitative analysis. When individuals have great discrepancies in SPH or SEM measurements between the eyes, this averaging can result in a muted linkage signal. For our data set, however, most individuals had similar SPH and SEM values for both eyes. Which refractive error parameter is the "correct" one to use for refractive error genetics study is unknown; therefore, we chose to use both measurements in our analyses. SPH and SEM are surrogates for axial length-a metric that was not consistently obtained for participants. Of the 1,004 individuals with SPH measurements available for both eyes, only 37, or 3.7%, could be classified as having one normal eye and one moderately myopic eye (−1.00 OD ≤ SPH≤−3.00 OD), and only two, or 0.2%, could be classified as having one normal or moderately myopic eye and one highly myopic eye (SPH≤−6.00 OD). For SE, the percentages of different myopia classifications between the two eyes were 2.1% and 0.3%, respectively. Individuals who were missing a measurement in at least one eye were assigned as missing for the phenotype of interest. We used the Sequential Oligogenic Linkage Analysis Routines (SOLAR) variance components procedure to perform quantitative trait linkage analysis using 4,656 tagging SNPs across the genome [37]. Under this procedure, the quantitative phenotypes SPH and SEM were defined as linear functions of the n QTLs (ri) that influence the trait: where X is a matrix of covariates and β is the regression coefficient matrix for the covariates. Parameters that comprise the likelihood function of SPH or SEM include the identical by descent (IBD) probability for a marker linked to a QTL, the additive genetic variance attributable to the unobserved QTL (σ 2 q), and other variance components. Using a likelihood-ratio test, SOLAR tests a null hypothesis of no linkage (σ 2 q=0). SOLAR also calculates a logarithm of the odds (LOD) score as evidence of linkage to an individual marker for two-point analysis or to an imputed chromosomal position for multipoint analysis. Before calculating the likelihood, locus-specific IBD information was computed for all pairs of relatives [37]. Following the premise of the Fulker et al. [38] method, the SOLAR multipoint mapping strategy uses the map distance between markers to compute the IBD information for a pair of relatives at a QTL linked to a marker. For our analysis, we used a Kosambi sex-averaged map obtained from deCode [39]. The factor representing ascertainment sites was included as a house effect in our model to adjust for ascertainment bias. CentiMorgans=cM.
That is, potential differences across ascertainment centers were investigated in our analysis by treating center as a random effect in the variance component model. We originally included sex as a covariate in the polygenic model, but dropped it from the final model due to a lack of significance. We examined the normality of phenotypic distribution before the linkage analysis using a Q-Q plot to meet the underlying assumption of the variance component model. We investigated several transformations, and viewed plots of log-transformed data in Q-Q plots. To meet the criterion of normality, we used the following formulas: SPHnew=4.

RESULTS
Of 254 families typed by the CIDR, the largest subset comprised 225 Caucasian families. This subset included 1,168 subjects. SPH data for only one eye were available for 164 subjects, and 256 subjects had SEM data for only one eye. These individuals were therefore not included in quantitative phenotype SPH or SEM analytical assessments. Summary descriptive statistics information for the samples included in the QTL analysis is provided in Table 1.
The heritability of SPH or SEM was highly significant (p=0.0001298 and p=0.0000006 for SPH and SE, respectively), with heritabilities of 23.2% (SPH) and 33.9% (SE). We used a threshold level of LOD≥1.50 to highlight promising initial linkage regions of interest for two-point and multipoint linkage analyses ( Table 2 and Table 3) [40].
Quantitative trait loci linkage regions for the sphere phenotype: Four markers across two chromosomes resulted in two-point LOD scores ≥ 1.5 (Table 2). Graphical results for the genome-wide linkage scan are depicted in Figure 1, and promising linked markers and linkage regions are summarized in Table 2 and Table 3. The rs2000203 at 87.57 centiMorgans (cM; two-point LOD=1.76) and rs1457947 at 89.14 cM (twopoint LOD=1.52) on chromosome 6 are in close proximity to each other. These markers also fall within a linkage region identified by the Caucasian subset in our data when treating SPH as a binary trait [29]. However, none of these four markers are near known MYP loci reported previously ( Table  2). Five linkage regions on separate chromosomes contained peak multipoint LOD scores ≥ 1.5 (Table 3). Figure 2 depicts a clear multipoint peak for chromosome 5q35.1-5q35.2 extending from 180 to 192 cM with a maximum LOD of 2.05, which is supported by rs4868073 (183.33 cM) with a twopoint LOD score of 1.27. The second highest multipoint peak is on chromosome 6q13-6q16.1 (peak LOD=1.96, 88 cM), which showed a LOD score over 2 when the SEM was analyzed (see next section). Other regions of interest include chromosome 1p31.1 (peak multipoint LOD=1.69 at 100 cM), chromosome 10p11.21-10q11.22 (peak multipoint LOD=1.68 at 64 cM), and chromosome 5p13.3 (peak multipoint LOD=1.51 at 49 cM; Figure 2). The chromosome 1p31.1 locus is close to the MYP14, located at 1p36, and the 5p13.3 locus is near a known myopia locus, MYP16, located at 5p15.33-5p15.2.
Quantitative trait loci linkage regions for the spherical equivalent phenotype: The linkage analyses of the SEM derived more markers and linkage regions of interest. There are 26 markers across 12 chromosomes that resulted in twopoint LOD scores ≥1.5, and eight linkage regions from eight chromosomes that contained peak multipoint LOD scores ≥1.5 (Table 2 and Table 3, and Figure 3). Similar to the SPH phenotype, the strongest evidence of linkage for the SEM occurred at chromosome 6q13-6q16.1 (peak two-point LOD=2.13 at rs1179900 and peak multipoint LOD=2.18 at 95 cM) and chromosome 5q35.1-5q35.2 (peak two-point LOD=1.62 at 187.27 cM and peak multipoint LOD=1.80 at 188 cM). In particular, the chromosome 6q13-16.1 region is strongly supported by a set of seven markers with two-point LOD scores ranging from 1.19 to 2.13 ( Table 2). The overall view of the two-point and multipoint linkage results for chromosome 6 is depicted in Figure 4.

DISCUSSION
We report a large-scale refractive error quantitative trait linkage study that used dense SNP genotyping as opposed to microsatellite markers. In this study, we identified two loci of high interest linked to myopic refractive error: (1) chromosome 6q13-6q16.1 with multipoint peak LOD scores  (2) chromosome 5q35.1-5q35.2 with multipoint peak LOD scores of 2.05 for SPH and 1.80 for SEM. The consistency of the LOD score significance derived from two different but similar measures of refractive error, SPH and SE, for these two loci. The chromosome 5q35.1-35.2 locus overlaps with a region that we reported previously using high myopia as a qualitative disease phenotype [29]. The chromosome 6q13-6q16.1 locus has not been previously reported as a linkage region for myopia in the literature.
Overall, our LOD scores are low (not achieving the standard significant threshold LOD ≥3) [41]. This is expected as we determined a relatively low residual heritability (24%-25%) of refractive error for our data set. Furthermore, since quantitative traits likely result from a confluence of factors including the involvement of several polymorphic genes and environmental conditions, the signal at a single QTL will not appear as strong given other factors acting on the trait. Despite these limitations, the pattern of our results still provides a good overview of refractive error loci across the genome.
In conducting our analysis, we recognized that an investigation of axial length, a major determinant of axial myopia, in our data might be insightful. Unfortunately, axial length data were collected for fewer than a third of our study participants (total sample size=266 compared to 1,004 and 912 for SPH and SE, respectively). Quantitative trait linkage analysis was conducted for the axial length data set, but no two-point or multipoint results exceeded a LOD threshold of 1.5. The power to detect a linkage signal for axial length was severely hampered by the low sample size.
In our previous report of linkage regions for high-grade myopia, the linkage regions were largely inconsistent between the analyses of the disease states defined by the SPH and the SEM [29]. When SPH and SEM are analyzed as quantitative traits, consistent results were found. This is expected, as the spreads of the SPH and SEM distributions are similar (SD: ±5.66 [SPH] versus ±5.57 [SE]). Most importantly, this shows the advantage of using quantitative traits in analyses. Comparatively, dichotomizing a distribution to a binary variable tends to lose power and adds phenotype state The present study provides important information. First, two new loci on chromosomes 5 and 6 are likely to be new chromosomal regions that link to myopic refractive errors. Second, our linkage analysis again replicated suggestive significance of the MYP3 locus on chromosome 12q. Clearly, this locus should undergo additional scrutiny in other data sets. Third, our analysis underscores the benefits of using refractive error as a quantitative trait for linkage, and demonstrates the advantage of using an SNP-based linkage screening protocol with higher marker density than conventional microsatellite markers to map new loci. The results of the present study of a large family data set using high-density SNPs for linkage scanning should aid in triaging candidate genes and loci for future genome-wide association studies and deep sequencing efforts.